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Instruments for radio astronomical observations have come 
a long way. While the first telescopes were based on very 
large dishes and 2-antenna interferometers, current instruments 
consist of dozens of steerable dishes, whereas future instru- 
ments will be even larger distributed sensor arrays with a 
\ hierarchy of phased array elements. For such arrays to provide 
i meaningful output (images), accurate calibration is of critical 
importance. Calibration must solve for the unknown antenna 
■ gains and phases, as well as the unknown atmospheric and 
' ionospheric disturbances. Future telescopes will have a large 
number of elements and a large field of view. In this case 
[ the parameters are strongly direction dependent, resulting in a 
large number of unknown parameters even if appropriately 
constrained physical or phenomenological descriptions are 
used. This makes calibration a daunting parameter estimation 
task. 

I. Introduction 

Astronomers study the physical phenomena outside the 
Earth's atmosphere by observing cosmic particles and elec- 
tromagnetic waves impinging on the Earth. Each type of ob- 
servation provides another perspective on the universe thereby 
unraveling some mysteries while raising new questions. Over 
the years, astronomy has become a true multi- wavelength 
science. A nice demonstration is provided in Fig. [T] In this 
image, the neutral hydrogen gas observed with the Wester- 
bork Synthesis Radio Telescope (WSRT) exhibits an intricate 
extended structure that is completely invisible in the optical 
image from the Sloan digital sky survey [1]. The radio 
observations therefore provide a radically different view on 
the dynamics of this galaxy. 

Images like Fig. [T] are only possible if the instruments used 
to observe different parts of the electromagnetic spectrum 
provide a similar resolution. This poses quite a challenge since 
the resolution of any telescope is determined by the ratio 
of the wavelength and the telescope diameter. Consequently, 
the aperture of radio telescopes has to be 5 to 6 orders of 
magnitude larger than that of an optical telescope to provide 
comparable resolution, i.e. radio telescopes should have an 
aperture of several hundreds of kilometers. Although it is not 
feasible to make a dish of this size, it is possible to synthesize 
an aperture by building an interferometer, i.e., an array. 

Radio astronomy started with the discovery by Karl Jansky, 
at Bell Telephone Laboratories in 1928, that the source of 
unwanted interference in his short-wave radio transmissions 
actually came from the Milky Way. For this, he used the 
large antenna mounted on a turntable shown in Fig. [Ja). 




Fig. I. Image of the spiral galaxy NGC 5055, showing the structure of the 
neutral hydrogen gas observed with the Westerbork Synthesis Radio Telescope 
(blue) superimposed on an optical image of the same galaxy from the Sloan 
digital sky survey (white) [2]. 



Subsequent single-antenna instruments were based on larger 
and larger dishes, culminating in the Arecibo telescope (Puerto 
Rico 1960, 305 m non-steerable dish) and the Effelsberg 
telescope (Bonn, Germany, 1972, 100 m steerable dish. Fig. 
Wib)). Making larger steerable dishes is not practical. 

An interferometer measures the correlation between two 
antennas spaced at a certain distance. Initially used to study 
a single source passing over the sky, the principle was used 
in optical astronomy in the Michelson stellar interferometer 
(1890, 1920); the first radio observations using two dipoles 
were done by Ryle and Vonberg in 1946 [3]. Examples of 
subsequent instruments are: the Cambridge One Mile Tele- 
scope in Cambridge, UK (1964, 2 fixed and one movable 18.3 
m dishes); the 3 km WSRT in Westerbork, The Netherlands 
(1970, 12 fixed and 2 movable 25 m dishes. Fig. [^c)); the 
36 km Very Large Array (VLA) in Socorro, New Mexico, 
USA (1980, 27 movable 25 m dishes. Fig. [^d)); the 25 km 
Giant Meter- Wave Radio Telescope (GMRT) in Pune, India 
(1998, 30 dishes with 45 m diameter). These telescopes use 
the Earth rotation to obtain a sequence of correlations for 
varying antenna baseline orientations relative to the desired sky 
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Fig. 2. The radio telescopes of (a) Jansky, (6) Effelsberg, (c) WSRT, (d) VLA, (e) concept for ALMA. 



image field, resulting in high-resolution images via synthesis 
mapping. Even larger baselines (up to a few thousand km) 
were obtained by combining these instruments into a single 
instrument using a technique called VLB I (very long baseline 
interferometry), where the telescope outputs are time- stamped 
and post-processed by correlation at a central location. An 
extensive historical overview is presented in [4]. In the near 
future, astronomers are building even larger arrays, such as 
the Atacama Large Millimeter Array (ALMA, Chile, 2011, 50 
movable 12 m dishes with possible extension to 64 dishes. Fig. 
[2Ke)), the Low Frequency Array (LOFAR, The Netherlands 
(2009, about 30,000 dipole antennas grouped in 36 stations. 
Fig. Q, and the Square Kilometer Array (SKA, 2020+, Fig. 
[5]). A recent issue of the Proceedings of the IEEE (Vol.97, No. 
8, Aug. 2009) provides overview articles discussing many of 
the recent and future telescopes. 

High-resolution synthesis imaging would not be possible 
without accurate calibration. Initially, the complex antenna 
gains and phases are unknown; they have to be estimated. 
Moreover, propagation through the atmosphere and ionosphere 
causes additional phase delays that may create severe distor- 
tions. Finally, image reconstruction or map making is governed 
by finite sample effects: we can only measure correlations 
on a small set of baselines. Solving for these three effects 
is intertwined and creates very interesting signal processing 
problems. In this overview paper, we focus on the calibration 
aspects, whereas imaging is covered in a companion paper [5]. 
The examples provided in this paper are generally borrowed 
from low frequency (< 1.5 GHz) instruments, but the frame- 
work presented is applicable to high frequency instruments 
like ALMA as well. 

II. Interferometry and image formation 

The concept of interferometry is illustrated in Fig. [3j An 
interferometer measures the spatial coherency of the incoming 



electromagnetic field. This is done by correlating the signals 
from the individual receivers with each other. The correlation 
of each pair of receiver outputs provides the amplitude and 
phase of the spatial coherence function for the baseline defined 
by the vector pointing from the first to the second receiver. 
These correlations are called the visibilities. 

Ideal measurement model 

To describe this mathematically, let us assume that there are 
J array elements called "antennas" Q pointed at a field with Q 
point sources. Stack the sampled antenna signals for the k-\h 
narrowband [6] frequency channel centered at frequency fk 
into a J X 1 vector x(n). For notational convenience, we will 
drop the dependence on frequency from the notation in most 
of the paper. Then we can model x(n) as 

Q 

= X^^gH-^gH + n(n) (1) 

where Sq{n) is the signal from the q-\h source at time sample 
n and frequency fk, ^q{n) is the array response vector for 
this source, and n(n) is the noise sample vector. Sq{n) and 
n(n) are baseband complex envelope representations of zero 
mean wide sense stationary white Gaussian random processes 
sampled at the Nyquist rate. 

Due to Earth rotation the geometrical delay component of 
Siq(n) changes slowly with time, which is a critical feature 
exploited in synthesis imaging. Let N be the number of time 
samples in a short term integration (STI) interval. We assume 
that Siq{n) is (relatively) constant over such an interval, so 
that, for the m-th interval, x(n) is wide sense stationary over 
{m — 1)N < n < mN — 1. A single STI autocovariance is 
defined as 

= ^{x(n) x^(n)} = A^E, + E,, (2) 

^As discussed in Sec. [nil each element may be a phased array itself! 
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Fig. 3. Schematic overview of a radio interferometer. 



where has size J x J, 

Am = [ai((m-l)7V),--- ,ag((m-l)iV)] 

= diag{[a^,--- ,cr|]} 

En = ^{n(n)n^(n)} = diag{[<i,... 

Here, is the variance of the q-th source. Noise is assumed 
to be independent but not evenly distributed across the array, 
and the noise variances a'^ j are unknown. With some abuse 
of notation, the subscript n in 5]^ and anj indicates "noise". 

Each matrix element of (Rm)^j represents the interferomet- 
ric correlation along the baseline vector between the antennas 
i and j in the array. The corresponding short term integration 
sample covariance estimate is 

^ mN-l 

^^ = 7V ^ x(n)x^(n), 

n=(m-l)Ar 

and this is what the interferometer measures for subsequent 
processing. In practical instruments, the short-term integration 
interval is often in the order of 1 to 30 seconds, the total 
observation can span up to 12 hours, and the number of 
frequency bins is highly design-dependent ranging in order 
of magnitude from 10 to 10^. 

Under ideal circumstances, the array response matrix 
is equal to a phase matrix due entirely to the geometrical 
delays associated with the array and source geometry, and 
accurately known, at least for the calibration sources. The 
columns of K^, denoted by km,q = 1, • • • , Q), are often 
called the "Fourier kernel" and are given by 

km,g expjj— — Z Pm,g} 

where c is the speed of light, is the position column vector 
for the j-th array element, and Pm,g is a unit length vector 
pointing in the direction of source q during STI snapshot m. 

Image formation 

Ignoring the additive noise, the measurement equation ([J]), 
in its simplest form, can be written as 

Q 



3 




Fig. 4. A radio interferometer where stations consisting of phased array 
elements replace telescope dishes. The ionosphere adds phase delays to the 
signal paths. If the ionospheric electron density has the form of a wedge, it 
will simply shift the apparent positions of all sources. 

where (Rm)ij is the measured correlation between antennas i 
and j at STI interval m, /(•) is the brightness image {map) of 
interest, and is the unit direction vector of the g-th source 
at a fixed reference time. It describes the relation between 
the observed visibilities and the desired source brightness 
distribution (intensities), and it has the form of a Fourier 
transform; it is known in radio astronomy as the Van Cittert- 
Zernike theorem [4], [7]. Image formation {map making) is 
essentially the inversion of this relation. Unfortunately, we 
have only a finite set of observations, therefore we can only 
obtain a dirty image: 

= Yl ^(p?) ^(p - p?) 

q 

where p corresponds to a pixel in the image, and where the 
dirty beam, also referred to as synthesized beam or point 
spread function (psf), is given by 

Id{p) is the desired image convolved with the dirty beam, 
essentially a non-ideal point spread function due to the finite 
sample effect. Every point source excites a beam B{-) centered 
at its location p^. Deconvolution is the process of recovering 
/(•) from Id{-) using knowledge of the dirty beam. A standard 
algorithm for doing this is CLEAN [8]. The autocorrelations 
are often not used in the image formation process to reduce 
the impact of errors in the calibration of the additive noise on 
the resulting image. More details are shown in [9] and in the 
companion paper [5]. 
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Non-ideal measurements 

Although the previous equations suggest that it is rather 
straightforward to make an image from radio interferometer 
data, there are several effects that make matters more compli- 
cated: 

• Receiver element complex gain variations. Astronomical 
signals are very weak, and radio telescopes therefore need 
to be very sensitive. This sensitivity is inversely propor- 
tional to the (thermal) noise. This dictates the use of low- 
noise amplifiers, which are sometimes even cryogenically 
cooled. Variations in environmental conditions of the 
receiver chain, such as temperature, cause amplitude and 
phase changes in the receiver response. Signals must also 
be propagated over long distances to a central processing 
facility and, depending on where digitization occurs, there 
can be significant phase and gain variations over time 
along these paths. 

• Instrumental response. The sensitivity pattern of the indi- 
vidual elements, the primary beam, of an interferometer 
will never be perfect. Although it is steered towards the 
source of interest, the sensitivity in other directions (the 
side lobe response) on the sky will not be zero. This poses 
a challenge in the observation of very weak sources which 
may be hampered by signals from strong sources that are 
received via the side lobes, but are still competing with 
the signal of interest. The algorithms correcting for the 
instrumental response assume that the sensitivity pattern 
is known. This may not be true with the desired accuracy 
if the array is not yet calibrated. 

• Propagation effects. Ionospheric and tropospheric tur- 
bulence cause time-varying refraction and diffraction, 
which has a profound effect on the propagation of radio 
waves. As illustrated in Fig. (H in the simplest cases this 
leads to a shift in the apparent position of the sources. 
More generally, this leads to image distortions that are 
comparable to the distortions one sees when looking at 
ceiling lights from the bottom of a swimming pool. 

In practice, in (O is thus corrupted by a complex gain 
matrix which includes both source direction dependent 
perturbations and electronic instrumentation gain errors. It is 
the objective of calibration to estimate this matrix and track its 
changes over the duration of the observation. Some corrections 
(e.g., the complex antenna gain variations) can be applied 
directly to the measured correlation data, whereas other correc- 
tions (e.g., the propagation conditions) are direction dependent 
and are incorporated in the subsequent imaging algorithms. 
Very often, the estimation of the calibration parameters is done 
in an iterative loop that acts on the correlation (visibility) 
data and image data in turn, e.g., the self-calibration (Self- 
Cal) algorithm [10], [11] discussed in more detail later in this 
paper. 

III. Telescope architectures 

The physical model underlying the array calibration depends 
on the instrument architecture. This architecture also deter- 
mines the capabilities of the telescope and may therefore have 



a profound effect on the calibration strategy, as we will see 
later on. 

The Westerbork Synthesis Radio Telescope (WSRT) and the 
Very Large Array (VLA) have been the work horses of radio 
astronomy since the 1970s. Both telescopes are arrays of 25 
m dishes. The size of a dish determines its beam width, or 
Field of View (FOV) at a given wavelength, and hence the size 
of the resulting image, while the spatial extent of the array 
determines the resolution within the FOV. The illumination 
pattern of the feed on each dish determines its sensitivity 
pattern, which is commonly referred to as the primary beam. 
These telescopes can also form an instantaneous beam within 
this primary by coherent addition of the telescope signals 
(beamforming). This beam is called the array beam. Visibilities 
are measured by correlating the telescope signals. The baseline 
vectors on which the visibility function is observed during a 
full observation describe a synthesized aperture. The sampling 
within this aperture determines the sensitivity pattern of the 
synthesis observation, which is referred to as the synthesized 
beam or point spread function (psf) and corresponds to the 
dirty beam in the previous section. We thus have a beam hier- 
archy from the primary beam, which has a relatively large FOV 
(degrees) and relatively low sensitivity, via the instantaneously 
formed array beam to the point spread function, which has a 
small FOV (arcseconds) and a high sensitivity. 

The WSRT and the VLA have their optimum sensitivity at 
frequencies of 1 to a few GHz. At lower frequencies, several 
things change. There are many more strong sources (e.g. 
synchrotron emission power is proportional to wavelength), 
thus even sources far outside the main beam of the psf may 
show their effect due to non-ideal spatial sampling. At low 
frequencies, the ionosphere is also much more variable (the 
phase delays are proportional to wavelength). Observations at 
these frequencies are therefore more challenging and require 
considerable processing power for proper calibration. High 
dynamic range imaging at these frequencies has therefore only 
recently been considered. 

In the Low Frequency Array (LOFAR) [12], [13], which 
is currently being built in The Netherlands, the dishes are 
replaced by stations, each consisting of many small antennas 
distributed over an area of about 100 x 100 meter. Some 
stations are very closely spaced, others are placed up to several 
100 km away from the core. A station is a phased array 
of receiving elements with its own beamformer. The stations 
are steered electronically instead of mechanically, which al- 
lows them to respond quickly to transient phenomena. The 
receiving elements can either be individual antennas (dipoles), 
or compound elements (tiles) consisting of multiple antennas 
whose signals are combined using an analog beamformer. 
This system concept introduces two additional levels in the 
beam hierarchy: the compound (tile) beam and the station 
beam. The complete hierarchy of beam patterns is illustrated in 
Fig. [5] The Murchison Widefield Array (MWA) has a similar 
design and purpose as LOFAR, but is placed in the outback 
of Western Australia and has a maximum baseline length of 
about 3 km. [14]. 

At first sight there is not much difference between the cal- 
ibration of an array of stations (or dishes) and the calibration 



ACCEPTED BY SPM, FINAL VERSION, SUBMITTED APRIL 2, 2010 



5 





Fig. 5. (Center column) (d) The beamforming hierarchy with the array beam produced by an array of stations at the top and the antenna beam at the bottom. 
Subsequent levels in the hierarchy have beams that are narrower and more sensitive. (Left column) (a) the corresponding concept layout of LOFAR, (b) a 
LOFAR low band antenna station and (c) a MWA tile. (Right column) (e) a concept for SKA consisting of an array of stations, each with small dishes, (/) 
a concept for the SKA core station, and (g) a SKA demonstrator tile consisting of Vivaldi antennas. 



of a station itself, but there are some subtle differences. A 
station only contains short baselines (~ 100 m), which implies 
that it provides a much lower resolution than an array, while 
its constituents provide a much wider FOV. As we will see 
in the next section, this implies that the calibration becomes 
more challenging due to direction dependent effects. Another 
challenging aspect stems from the enhanced flexibility of 
electronic beamforming: this may result in a less stable beam 
than a beam that is produced mechanically with a reflector 
dish. Finally, the output of a station beamformer is insufficient 
for calibration: for this purpose the station should also provide 
the correlation data among individual elements, even if these 
are not used at higher levels in the hierarchy. 

A compound element (tile) can also be exploited as focal 
plane array (FPA). In this case, the array is placed in the 
focal plane of a dish. This allows to optimize the illumination 
of the dish, as it effectively defines a spatial taper over the 
aperture of the dish that can be used to create lower spatial 
side lobes. An FPA can also improve the FOV of the dish 
by providing multiple beams (off-axis) on the sky. In this 



case, the primary beam is the sensitivity pattern produced 
if the dish is illuminated by only a single antenna of the 
compound element. The compound beam is the electronically 
formed beam produced by illuminating the dish by the FPA. 
The FPA concept is currently under study in The Netherlands 
[15], United States [16], [17] and Australia [18] as part of 
the technology road map towards the Square Kilometer Array 
(SKA) [19]. 

SKA is a future telescope that is currently in the concept 
phase. It is a wide band instrument that will cover the 
frequency range from 70 MHz to above 25 GHz. Apart from 
cost the design is driven by a trade-off between sensitivity 
and survey speed: the speed at which the complete sky can be 
observed. To enable wide band operation, it will probably use 
a mix of receiver technologies: 

• Dishes with wide-band single pixel feeds. At the highest 
frequencies this gives the highest sensitivity. Since this 
concept provides a very stable beam it is well suited for 
high fidelity imaging. 

• Dishes with focal plane arrays. At intermediate frequen- 
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cies FPAs can be used to enlarge the FOV of a single 
dish in a cost effective way. 
• Aperture arrays. At low frequencies, it is easier to obtain 
a large collecting area, hence sensitivity, by using dipoles 
instead of dishes. Aperture arrays have the additional 
advantage of being very flexible: by duplicating the 
receiver chains one can have multiple independent beams 
on the sky, bandwidth can be traded against the number of 
beams on the sky, and electronic beamforming provides 
a quick response time to transient events. 

The configuration of SKA is still under study. Current concepts 
include a dense core that contains, e.g., half of all receivers 
within a diameter of 5 km, stations consisting of aperture 
arrays out to a maximal baseline of 180 km, and dishes out 
to a maximal baseline of 3000 km. 

IV. Calibration scenarios 

In the previous section, we introduced several telescope 
architectures, each with different characteristics, hierarchy, and 
parameterizations of the observed data model. This will lead 
to a wide variety of calibration requirements and approaches. 
Fortunately, it is possible to discuss this in a more structured 
manner by using only four different scenarios [20], each 
of which can be described by a distinct specialization of 
the measurement equation [21]. Each scenario considers the 
calibration of an array of elements with complex gain varia- 
tions and spatially varying propagation effects. The scenarios 
compare the array aperture (the length of the largest baseline) 
to the field of View (FOV, the beamwidth of each individual 
array element) and the isoplanatic patch size, i.e., the scale at 
which the ionosphere/troposphere can be considered constant. 

Scenario 1. As shown in Fig. [6] (top left), the receiving 
elements of the array have a small FOV and the maximum 
baseline is short. In this case, all receiving elements and all 
lines of sight within the FOV experience the same propagation 
conditions: the propagation effects do not distort the image. 
This scenario represents the case in which direction dependent 
effects do not play a significant role. The calibration routine 
can therefore focus completely on element based gain effects. 
Since the FOV is small, it is often possible to calibrate on 
a single strong source in the FOV, especially if the array 
elements can be steered to a nearby calibration source. Due to 
its simplicity, this scenario is often used to obtain a first order 
calibration for new instruments. 

Scenario 2. In this scenario (Fig. [6l top right), we have a 
large array consisting of elements with a small FOV. Lines 
of sight from different elements towards the region of interest 
are subject to different propagation conditions, but the prop- 
agation conditions for all lines-of- sight within the FOV of 
an individual element are the same. The propagation effects 
can therefore be merged with the unknown receiver gains of 
each element, and the array can be calibrated under the same 
assumptions as in the first scenario. This scenario is valid for 
most of the interferometers built in the 1970s and 1980s, such 
as the WSRT and the VLA, and for VLBI observations. 

Scenario 3. Fig. [6] (bottom left) depicts the third scenario 
in which the elements have a large FOV, but the array is 



small. This implies that all lines of sight go through the 
same propagation path, but that there may be considerable 
differences in propagation conditions towards distinct sources 
within the FOV. The ionosphere and troposphere thus impose 
a direction dependent gain effect that is the same for all 
elements. This scenario can also handle instrumental effects 
that are the same for all elements (e.g., irregular antenna 
beamshapes) and is therefore well suited for the situation of 
a compact array of identical elements such as the MWA and 
a single LOFAR or SKA station. 

Scenario 4. As shown in the bottom right panel of Fig. O 
the elements have a large FOV and the array has a number 
of long baselines. The lines of sight towards each source 
may experience propagation conditions that differ for different 
elements in the array. This implies that distinct complex gain 
corrections may be required for each source and each receiving 
element. Calibration is not possible without further assump- 
tions on stationarity over space, time and/or frequency. This 
is the most general scenario, and valid for future telescopes 
such as LOFAR, SKA and ALMA. 

V. Array calibration 

Scenario 1 

In scenario 1 , the field of view (FOV) of each array element 
(dish) is small and it is reasonable to assume that there is only 
a single calibrator source within the beam. Often, the beam 
will even have to slightly point away from the field of interest 
to "catch" a nearby strong calibrator source. The calibrator 
should be unresolved, i.e., appear as a point source, as opposed 
to the extended structure visible in Fig. [T] 

We will assume that the STI sample covariance matrices 
are calibrated independently and omit the subscript m 
for notational convenience. Continuity over many STIs needs 
to be exploited under scenario 4 and will thus be discussed 
later. The data model (measurement equation) under scenario 
1 is given by 

R = GKE.K^G^ + En (3) 

where G = diag(g) is a diagonal matrix. For a single 
calibrator source, K has only a single column representing 
the geometric phase delays of the array towards the source, 
and Eg = cr^ is a scalar with the source power. Both the 
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Fig. 6. Calibration scenarios 1 through 4 (top left to bottom right) as defined 

direction of the source and its power are known from tables. 
Thus, in essence the problem simplifies to 

R = gg^ + En 

This is recognized as a "rank-1 factor analj^sis" model in 
multivariate analysis theory [22], [23]. Given R, we can solve 
for g and T,n in several ways [24]-[26]. E.g., any submatrix 
away from the diagonal is only dependent on g and is rank 
1: this allows direct estimation of g. In the more general case 
described by ([3]), multiple calibrators may be simultaneously 
present. The required multi-source calibration is discussed in, 
e.q., [27]-[30]. 

Scenario 2 

In this scenario, the ionospheric or tropospheric phases are 
different for each array element, but the field of view is 
narrow, and it is possible to assign the unknown ionospheric 
or tropospheric phases to the individual antennas. Thus, the 
problem reduces to that of Scenario 1 , and the same calibration 
solution applies. 




by Lonsdale [20] 
Scenario 3 

This scenario is relevant for sufficiently compact arrays, 
e.g., the calibration of a SKA or LOFAR station or an array 
with a relatively small physical extend like the MWA. The 
phase and gain of the station beam FOV is direction dependent 
but the array elements see the same ionosphere. It is possible 
to make a coherent image, but sources may have shifted to 
different locations. 

Since the FOV is large, several calibrator sources (say Q) 
will be visible. The model given by ^ for scenario 1 can 
be extended to include unknown source dependent complex 
gains, 

R=(GiKG2)5].(GiKG2)^ + 5]n , 

where Gi = diag(gi) represents the antenna gain, and G2 = 
diag(g2) the source dependent complex gains, which describe 
the antenna beam shape and the propagation conditions. In this 
model, we can merge the unknown G2 with Xlg to obtain a 
single unknown diagonal source matrix 5] = 02^3^2 , i-^-' 

R = GKEK^G^ + En 
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Ionospheric calibration based on a statistical model 

Assume for simplicity a single calibration source at 
zenith. The data model is 

R = aa^cr^ + all 

a is the spatial signature of the source at frequency fk, as 
caused (only) by the ionosphere, and given by 

a = exp(j(/)) , 4> = Crf-^ 

where is a vector with J entries representing the 
ionospheric phases at each station, vector r contains the 
Total Electron Content (TEC) seen by each station, and 
C is a constant. The TEC is the integral of the electron 
density along the line of sight, and is directly related to a 
propagation delay. 

The ionosphere is often modeled as a turbulent slab of 
diffracting medium. Assuming a single layer and a pure 
Kolmogorov turbulence process, the covariance for r is 
modeled by a power law of the form 

= I - a(D)®^ 

with unknown parameters a and j3 (theoretically, /3 = 5/3 
but measured values show that deviations are possible). 
The matrix D contains the pairwise distances between all 
antennas, and is known. 
Write the model as 

vec(R) = (a (g) aja^ + vec(I)cr^ 

The observed covariance matrix is vec(R) = vec(R(r)) + 
w, where the observation noise w has covariance Cw = 
-^(R^ R). At this point, we could estimate r by a 
Least Squares model matching. However, we have a priori 
knowledge on the parameters r, i.e., the covariance C^-. 
The maximum a posteriori (MAP) estimator exploits this 
knowledge, and leads to 

f = argmin||Cw*vec(R-R(r))f + ||C;V||^ 
r 

This is solved as a nonlinear least squares problem {a and 
P are estimated as well). 



The dimensionality can be reduced by introducing r = U^, 
where U contains a reduced set of basis vectors. These can 
be 

• Data independent, e.g., simple polynomials, or Zernike 
polynomials [31], 

• Data dependent (Karhunen-Loeve), based on an eigen- 
value decomposition of C^-: C^- ~ UAU^. Only the 
dominant eigenvectors are retained. 

Selection of the correct model order is often a tradeoff 
between reduced modeling error and increased estimation 
variance. Indeed, the simulation below indicates that the 
Least Squares estimator (with either a Zernike basis or a 
Karhunen-Loeve basis) has an optimal model order, beyond 
which the mean squared estimation error increases. The 
MAP estimator adds an additional term to the cost function 
that penalizes "weak" parameters and makes it robust to 
overmodeling. For more details, see [32]. The validity of 
the turbulence model is experimentally tested on VLA 
survey data in [33], [34]. 




Model order 



Estimation performance as function of model order selection 



Given R, the objective is to estimate G, X) and E^. Here, K is 
known as we know the source locations. If there is significant 
refraction, each viewing direction may pass through a different 
phase wedge, causing direction dependent motion of sources 
(but no further deformation). In that case, we will also have 
to include a parametric model for K and solve for the source 
directions (DOA estimation). This scenario is treated in, e.g., 
[30], [35]. 



The most straightforward algorithms to solve for the un- 
knowns are based on alternating least squares. Assuming that 
a reasonably accurate starting point is available, we can solve 
G, and 5]^ in turn, keeping the other parameters fixed at 
their previous estimates [30]: 



• Solve for instrument gains: 

g = argmin||R - G(K5]K^)G^ - T^nf 
g 

= argmin||vec(R — 5]^)+ 
g 

-diag(vec(Ro))(g(g)g)|p 

where Rq = KXIK^. This problem cannot be solved 
in closed form. Alternatively, we can first solve an 
unstructured problem: define v = g (g) g and solve 

V = diag(vec(Ro))~"^vec(R — 5]^) 

or equivalently 

g^ = (R-En)0Ro. 
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where denotes a pointwise division. After this, we 
can do a rank-1 approximation to find g. The pointwise 
division can lead to noise enhancement; this is remediated 
by only using the result as initial estimate for, e.g., Gauss- 
Newton iteration [28] or by formulating a weighted least 
squares problem instead [26], [30]. 

• Solve for source powers a = diag(5]).- 

a = argmin||R - GKEK^G^ - "EnWl 
a 

= argmin||vec((R - E^) - (GK)5](GK)^)|p 
a 

= argmin||vec(R - E^) - (GK) o (GKjtrf 
a 

= (GKoGK)tvec(R-5]n) 

• Solve for noise powers cr^ = diag(5]n)-' 

&n = argmin||R - GKEK^G^ - E^lP 

= diag(R - GKEK^G^). 

A more optimal solution can be found by covariance match- 
ing estimation, which provides an asymptotically unbiased 
and statistically efficient solution [36]. However, there is no 
guarantee that this will hold for a weighted alternating least 
squares approach. Fortunately, the simulations in [30] suggest 
that it does for this particular problem, even if the method is 
augmented with weighted subspace fitting [37], [38]. 

The first step of this algorithm is closely related to the self- 
calibration (SelfCal) algorithm [10], [11] widely used in the 
radio astronomy literature, in particular for solving Scenario 
1 and 2. In this algorithm, Rq is a reference model, obtained 
from the best known map at that point in the iteration. 

An alternative implementation is Field-Based Calibration 
[39]. Assuming the instrumental gains have been corrected 
for, an image based on a short time interval is made. The 
apparent position shifts of the strongest sources are related to 
ionospheric phase gradients in the direction of each source. 
These "samples" of the ionosphere are interpolated to obtain 
a phase screen model over the entire field of view. This can 
be regarded as image plane calibration. The method is limited 
to the regime where the ionospheric phase can be described 
by a linear gradient over the array. 

For the MWA currently a real time calibration method based 
on "peeling" is being investigated [40]. In this method of 
successive estimation and subtraction calibration parameters 
are obtained for the brightest source in the field. The source 
is then removed from the data, and the process is repeated for 
the next brightest source. This leads to a collection of samples 
of the ionosphere, to which a model phase screen can be fitted. 

Scenario 4 

This scenario is the most general case and should be applied 
to large arrays with a wide field of view such as LOFAR 
and SKA. In this case, each station beam sees a multitude 
of sources, each distorted by different ionospheric gains and 
phases. The data model for the resulting direction-dependent 
calibration problem is 

R = A5],A^ + 5]n = (G0K)E,(G0K)^ + En, 



where G = [gi, • • • ,gQ] is now a full matrix; Eg and K 
are known. G and Tin are unknown. Without making further 
assumptions, the solution is ambiguous: the gains are not 
identifiable. 

This problem is discussed in [41] and studied in more detail 
in [42]. Possible assumptions that may lead to identifiability 
are: 

• Bootstrapping from a compact core. The planned geome- 
try of LOFAR and SKA includes a central core of closely 
packed stations. Under suitable conditions, these can be 
calibrated as under Scenario 3, giving a starting point for 
the calibration of the other stations. 

• Exploiting the different time and frequency scales. Sup- 
pose we have a number of covariance observations R/c,m, 
for different frequencies fk and time intervals m. The 
matrix K = K^^^ is varying over frequency and time, 
whereas the instrumental gains are relatively constant. 
This can be exploited to suppress contaminating sources 
by averaging over K/e,m while correcting the delays 
towards the calibrator sources. 

• Modeling the gain matrix G = Gk.m- The gain matrix 
can be approximated by a low order polynomial model 
in k and m, leading to a reduction in the number of 
unknowns. As basis functions for the polynomials we 
can use the standard basis, or Zernike polynomials (often 
used in optics), or a Karhunen-Loeve basis derived from 
the predicted covariance matrix (see Box "Ionospheric 
calibration"). 

• Successive estimation and subtraction or ''peeling''. In 
this method a distinction is made between the instrumen- 
tal gains and ionospheric gains based on considerations 
such as temporal stability and frequency dependence. 
Sources are estimated and removed from the data in an 
iterative manner. This leads to a collection of samples to 
which a global model of the ionosphere or station beam 
can be fitted. 

A complete calibration method that incorporates many of the 
above techniques was recently proposed in [43], and was 
successfully tested on a number of 74 MHz fields observed by 
the VLA. The behavior of this method at lower frequencies 
and / or on baselines longer than a few tens of kilometers still 
needs investigation. 

VI. Compound element calibration 

Compound elements are used in very large aperture arrays 
to keep the number of correlator inputs manageable, and as 
focal plane arrays to increase the FOV of dishes. In either case, 
each compound element produces a superposition of antenna 
signals, x(n), at its output port y{n) during normal operation, 
i.e., 

y{n) — w^x(n) 

where w are the beamformer weights. Compound elements 
therefore require a separate calibration measurement before or 
after the observation, as only y{n) is available and no antenna 
specific information can be derived from this superposition. 
Compound elements should thus be designed to be stable over 
the time scales of a typical observation. 
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Initial system characterization is often done in an ane- 
choic chamber. In these measurements, the response y{n) 
of the compound element to a test probe is recorded, 
while varying the beamformer w [44]. The measurements 
^2(^)5 • jUNi'Ti) are stacked in a vector y while 
the corresponding weights v^i, W2, • • • , wat are stacked in a 
matrix W. The complete series can be summarized as 

y(n) = W^Gks{n) 

where s{n) is the known input signal, k is the phase vector 
describing the geometrical delays due to the array and source 
geometry and G = diag (g) contains the instrumental gains. 
The gains of the individual elements stacked in g are then 
easily derived. An attractive feature of this method is that it can 
also be used in the field using a stationary reference antenna 
[45]. 

Calibration of an aperture array of compound elements is 
discussed in the previous section. Regarding its use as a focal 
plane array in a dish, there are a number of differences, mostly 
because the dish projects an image of the sky within its FOV 
onto the FPA, whereas an aperture array measures the complex 
field distribution over the aperture itself. 

The goal of FPA beamforming is to maximize the signal- 
to-noise ratio in an observation [16]. This involves a trade-off 
between maximizing the gain in the direction of the source 
and minimization of the total noise in the system. A very 
intuitive approach to maximize the gain towards the source 
is conjugate field matching. For this calibration method, the 
array response is measured for a strong point source for 
each of the P compound beams that will be formed by the 
FPA. The weights of the p-th compound beam are chosen such 
that Wp = dip. Since the dish forms an image of the point 
source on the FPA, most of the energy is concentrated on a 
few elements. Conjugate field matching thus assigns very high 
weights to a few elements and the noise of these elements 
will therefore dominate the noise in the observation. If one 
of these elements has a poor noise performance, conjugate 
field matching does not lead to the maximum signal-to-noise 
ratio in the observation. The measurement on a strong point 
source should therefore be augmented with a measurement on 
an empty sky to obtain the noise covariance matrix of the 
array. This step allows proper weighting of the receiver paths 
to optimize the signal-to-noise ratio of the actual observation. 
An excellent overview of FPA signal processing is provided 
in [17]. 

VII. Future challenges 

Instruments like LOFAR and SKA will have an unprece- 
dented sensitivity that is two orders of magnitude higher 
(in the final image) than current instruments can provide. 
The increased sensitivity and large spatial extent requires 
new calibration regimes, i.e., scenarios 3 and 4, which are 
dominated by direction dependent effects. Research in this 
area is ongoing. Although many ideas are being generated, 
only a limited number of new calibration approaches have 
actually been tested on real data. This is hardly surprising 
since only now the first of these new instruments are producing 



data. Processing real data will remain challenging and drive 
the research in this area of signal processing. Apart from the 
challenges discussed already throughout the text, some of the 
remaining challenges are as follows. 

• The sky. Because of the long baselines that are part of the 
new instruments, many sources which appear point- like 
to existing instruments, will be resolved. This means that 
they cannot be treated as point sources, but should be 
modeled as extended sources using, e.g., shapelets [46]. 
The new instruments will have wide frequency bands, so 
that the source structure may change over the observing 
band. For these reasons the source models will have to be 
more complicated than currently assumed. At the same 
time, due to the increased sensitivity, many more sources 
will be detected and will have to be processed. This will 
not only affect the calibration of the instruments, but also 
the imaging and deconvolution. 

Because of their high sensitivity the new instruments are 
capable of detecting very weak sources, but they will have 
to do so in the presence of all the strong sources already 
known. Some of those strong sources may not even be in 
the FOV, but may enter through the primary beam side 
lobes. 

• The instrument. Both aperture arrays and dishes with 
FPAs have primary beams that are less stable than single 
pixel primary beams from dishes. The primary beam is 
time dependent (if it is not fixed on the sky) and varies 
with frequency and over the different stations or FPA 
systems. These beam pattern variations have a negative 
impact on the achievable image quality. Calibrating for 
these beams is a real challenge, as is the correction for 
such beams during imaging. 

• The atmosphere. For low frequency instruments, iono- 
spheric calibration is a significant challenge. Current 
algorithms have been shown to work for baselines up 
to a few tens of kilometers and frequencies as low as 
74 MHz. However, for baselines of a few hundreds up to 
a few thousands of kilometers and frequencies down to, 
say, 10 MHz these algorithms may not be valid. 

• Polarization purity. Calibration and imaging have to take 
the full polarization of the signal into account. The 
primary beam of an instrument introduces instrumental 
polarization due to the reception properties of the feeds. 
If the feeds do not track the rotation of the sky, as is the 
case in any radio telescope that does not have an equato- 
rial mount, the instrumental polarization varies over the 
observation. The ionosphere alters the polarization of the 
incoming electromagnetic waves as well due to Faraday 
rotation. These effects require calibration and correction 
with high accuracy. 

• The large number of elements. Classical radio telescopes 
have at most a few tens of receivers (WSRT has 14 dishes 
and the VLA has 27). Instruments like LOFAR, MWA 
and EMBRACE will have about 10^ receiving elements 
while SKA is envisaged to have over 10^ signal paths. 
The corresponding increase in data volumes will require 
sophisticated distributed signal processing schemes and 
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algorithms that can run on suitable high performance 
computing hardware. 

• Equations and unknowns. It is clear that in order to 
deal with these challenges more complicated models are 
needed, which in turn contain more unknowns that need 
to be extracted from the data. The increase in the number 
of stations will yield more equations, but this may not 
be enough. Modeling of the time-frequency dependence 
of parameters by a suitable set of basis functions will 
decrease the amount of unknowns that need fitting. 

• Interference mitigation. The radio frequency spectrum is 
rather crowded, and it is expected that many observations 
will be contaminated by (weak or strong) RFI. Array 
signal processing techniques can be used to suppress 
interference, e.g., by active null steering or covariance 
matrix filtering [47], [48]. For LOFAR and SKA, no tech- 
niques have been proposed yet. Research from cognitive 
radio and compressive sensing may be very relevant for 
interference avoidance. 

Finding suitable answers to these challenges will be of 
critical importance for the next generation of instruments. 
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